6 research outputs found

Clustering methods for big data analytics: techniques, toolboxes and applications

Author: Ben N'Cir Chiheb-Eddine
Nasraoui Olfa
Publication venue: Springer International Publishing
Publication date: 01/01/2018

Generalization of c-means for identifying non-disjoint clusters with overlap regulation

Author: Ben N'Cir Chiheb-Eddine
Cleuziou Guillaume
Essoussi Nadia
Publication venue: Elsevier BV
Publication date: 13/04/2014

International audience. Clustering is an unsupervised learning method that fits structures to unlabeled data sets. Detecting overlapping structures is a specific challenge that raises its own theoretical issues but offers relevant solutions for many application domains. This paper presents generalizations of the c-means algorithm that allow the overlap sizes to be parametrized. Two regulation principles are introduced that aim to control the overlap shapes and sizes with regard to the number and the dispersal of the clusters concerned. Experiments performed on real-world datasets show the efficiency of the proposed principles, and especially the ability of the second one to build reliable overlaps with easy tuning, whatever the required number of clusters.

HAL - Normandie Université

Overview of efficient clustering methods for high-dimensional big data streams

Author: Ben N'Cir Chiheb-Eddine
Hassani M.
Nasraoui Olfa
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2019

The majority of clustering approaches have focused on static data. However, a wide variety of recent applications and research issues in big data mining require dealing with continuous, possibly infinite streams of data arriving at high velocity. Web traffic data, surveillance data, sensor measurements, and stock trading are only some examples of these daily-increasing applications. Additionally, as the growth of data volumes is accompanied by a similar expansion in their dimensionality, clusters cannot be expected to appear completely when all attributes are considered together. Subspace clustering is a general approach that addresses this issue by automatically finding the hidden clusters within different subsets of the attributes rather than considering all attributes together. In this chapter, novel methods for efficient subspace clustering of high-dimensional big data streams are presented. Approaches that efficiently combine the anytime clustering concept with the stream subspace clustering paradigm are discussed. Additionally, efficient and adaptive density-based clustering algorithms are presented for high-dimensional data streams. A novel open-source assessment framework and evaluation measures for subspace stream clustering are also presented.

Généralisation des k-moyennes pour produire des recouvrements ajustables [Generalization of k-means to produce adjustable overlaps]

Author: Ben N'Cir Chiheb-Eddine
Cleuziou Guillaume
Essoussi Nadia
Publication venue: Hermann
Publication date: 29/01/2014

See: http://editions-rnti.fr/?inprocid=1001932
National audience. Finding non-disjoint groups in unlabeled data is an important problem in unsupervised classification. Overlapping clustering helps solve several real-world problems that require identifying groups that overlap. However, although overlaps between groups are tolerated or even encouraged in these applications, their extent should be controlled. In this paper we propose generalizations of k-means that allow overlaps to be controlled and parametrized. Two regulation principles are put in place; they aim to control overlaps with respect to their size and to the dispersion of the classes. Experiments carried out on real datasets show the value of the proposed principles.

HAL - Normandie Université